Software-Architecture Recovery from Machine Code
In this paper, we present a tool, called Lego, which recovers object-oriented software architecture from stripped binaries. Lego takes a stripped binary as input, and uses information obtained from dynamic analysis to (i) group the functions in the binary into classes, and (ii) identify inheritance and composition relationships between the inferred classes. The information obtained by Lego can be used for reengineering legacy software, and for understanding the architecture of software systems that lack documentation and source code. Our experiments show that the class hierarchies recovered by Lego have a high degree of agreement (measured in terms of precision and recall) with the hierarchy defined in the source code.
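As an illustration of how agreement between a recovered grouping and a source-level grouping might be scored, the sketch below computes pairwise precision and recall over "same class" function pairs. The abstract does not say how Lego defines its metric, so this pairwise formulation, the helper names, and the toy class and function names are all assumptions made for the example.

```python
from itertools import combinations


def same_class_pairs(classes):
    """Return the set of unordered function pairs that share a class."""
    pairs = set()
    for members in classes.values():
        pairs.update(frozenset(p) for p in combinations(sorted(members), 2))
    return pairs


def precision_recall(recovered, truth):
    """Pairwise precision/recall of a recovered grouping against the truth.

    This is one common clustering-style metric, assumed here for
    illustration; it is not taken from the paper.
    """
    rec = same_class_pairs(recovered)
    tru = same_class_pairs(truth)
    hits = len(rec & tru)
    precision = hits / len(rec) if rec else 1.0
    recall = hits / len(tru) if tru else 1.0
    return precision, recall


# Hypothetical toy data: class name -> set of function names.
truth = {"Shape": {"area", "draw"}, "Circle": {"radius", "circle_ctor"}}
recovered = {"C0": {"area", "draw", "radius"}, "C1": {"circle_ctor"}}

precision, recall = precision_recall(recovered, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case the recovered grouping merges one function into the wrong class, which lowers precision, and splits a true class apart, which lowers recall.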
Weighted Context-Free-Language Ordered Binary Decision Diagrams
Over the years, many variants of Binary Decision Diagrams (BDDs) have been
developed to address the deficiencies of vanilla BDDs. A recent innovation is
the Context-Free-Language Ordered BDD (CFLOBDD), a hierarchically structured
decision diagram, akin to BDDs enhanced with a procedure-call mechanism, which
allows substructures to be shared in ways not possible with BDDs. For some
functions, CFLOBDDs are exponentially more succinct than BDDs. Unfortunately,
the multi-terminal extension of CFLOBDDs, like multi-terminal BDDs, cannot
efficiently represent functions of type B^n -> D, when the function's range has
many different values. This paper addresses this limitation through a new data
structure called Weighted CFLOBDDs (WCFLOBDDs). WCFLOBDDs extend CFLOBDDs using
insights from the design of Weighted BDDs (WBDDs) -- BDD-like structures with
weights on edges. We show that WCFLOBDDs can be exponentially more succinct
than both WBDDs and CFLOBDDs. We also evaluate WCFLOBDDs for quantum-circuit
simulation, and find that they perform better than WBDDs and CFLOBDDs on most
benchmarks. With a 15-minute timeout, the number of qubits that can be handled
by WCFLOBDDs is 1,048,576 for GHZ (1x over CFLOBDDs, 256x over WBDDs); 262,144
for BV and DJ (2x over CFLOBDDs, 64x over WBDDs); and 2,048 for QFT (128x over
CFLOBDDs, 2x over WBDDs).
Synthesizing Specifications
Every program should always be accompanied by a specification that describes
important aspects of the code's behavior, but writing good specifications is
often harder than writing the code itself. This paper addresses the problem of
synthesizing specifications automatically. Our method takes as input (i) a set
of function definitions, and (ii) a domain-specific language L in which the
extracted properties are to be expressed. It outputs a set of
properties, expressed in L, that describe the behavior of the functions. Each of
the produced properties is a best L-property for the signature: there is no other
L-property for the signature that is strictly more precise. Furthermore, the set is
exhaustive: no more L-properties can be added to it to make the conjunction
more precise.
We implemented our method in a tool, spyro. When given the reference
implementation for a variety of SyGuS and Synquid synthesis benchmarks, spyro
often synthesized properties that matched the original specification
provided in the synthesis benchmark.
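The notion of a "best" property can be illustrated on a toy scale. The sketch below is not spyro's algorithm: it assumes a tiny finite input/output domain, a hand-written list of candidate properties standing in for the language L, and a brute-force check. A candidate is kept if it is sound for the function and no other sound candidate accepts a strict subset of its input/output pairs. All names here are hypothetical.

```python
from itertools import product

# Assumed finite universe of inputs and outputs, for brute-force checking only.
DOMAIN = range(-3, 4)
OUTPUTS = range(-3, 4)


def f(x):
    """Reference implementation whose behavior we want to describe."""
    return abs(x)


# Hypothetical stand-in for the properties expressible in L.
CANDIDATES = {
    "o >= 0": lambda x, o: o >= 0,
    "o >= x": lambda x, o: o >= x,
    "o >= -x": lambda x, o: o >= -x,
    "o == x": lambda x, o: o == x,
    "true": lambda x, o: True,
}


def accepted(prop):
    """All (input, output) pairs in the finite universe that prop allows."""
    return frozenset(
        (x, o) for x, o in product(DOMAIN, OUTPUTS) if prop(x, o)
    )


def is_sound(prop):
    """A property is sound if it holds on every actual behavior of f."""
    return all(prop(x, f(x)) for x in DOMAIN)


def best_properties(candidates):
    """Sound candidates with no strictly more precise sound candidate."""
    sound = {n: accepted(p) for n, p in candidates.items() if is_sound(p)}
    return sorted(
        name
        for name, pairs in sound.items()
        if not any(other < pairs for other in sound.values())
    )


print(best_properties(CANDIDATES))
```

Here "o == x" is rejected as unsound (it fails on negative inputs), and "true" is sound but dropped because each of the remaining candidates is strictly more precise.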